Analyzing A/B Test Results

Introduction

A/B tests are very commonly performed by data analysts and data scientists.

For this project, we will work to understand the results of an A/B test run by an e-commerce website. Our goal is to work through this notebook to help the company decide whether to implement the new page, keep the old page, or run the experiment longer before making a decision.

a. Read in the dataset from the ab_data.csv file and take a look at the top few rows here:

b. Use the cell below to find the number of rows in the dataset.

c. What is the number of unique users in the dataset?

e. The number of times the "group" is treatment but "landing_page" is not new_page.

f. Do any of the rows have missing values?
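The checks in parts (b) through (f) can be sketched as a single helper. This is a minimal illustration, assuming the dataframe has `user_id`, `group`, `landing_page` and `converted` columns; the `summarize` helper and the tiny `sample` frame are made up here and stand in for the real ab_data.csv.

```python
import pandas as pd

def summarize(df):
    """Return the basic counts asked for in parts (b) through (f)."""
    # Rows where the group says treatment but the page is not new_page
    treat_not_new = (
        (df["group"] == "treatment") & (df["landing_page"] != "new_page")
    ).sum()
    return {
        "n_rows": len(df),
        "n_unique_users": df["user_id"].nunique(),
        "treatment_not_new_page": int(treat_not_new),
        "any_missing": bool(df.isnull().any().any()),
    }

# Tiny made-up sample standing in for ab_data.csv (assumption, not real data)
sample = pd.DataFrame({
    "user_id": [1, 2, 3, 3],
    "group": ["control", "treatment", "treatment", "treatment"],
    "landing_page": ["old_page", "new_page", "old_page", "new_page"],
    "converted": [0, 1, 0, 1],
})
summary = summarize(sample)
print(summary)
```

With the real data, `pd.read_csv("ab_data.csv")` would replace the made-up `sample` frame.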

For rows where treatment is not aligned with new_page, or control is not aligned with old_page, we cannot be sure whether the user truly received the new or the old page. I therefore drop these rows and store the result in a new dataframe called df2.
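One way to build df2 is with a boolean mask that keeps only the aligned rows. This is a sketch under the same column-name assumptions as above; `keep_aligned` is a hypothetical helper name and the small `df` frame is made up for illustration.

```python
import pandas as pd

def keep_aligned(df):
    """Keep rows where treatment pairs with new_page and control with old_page."""
    aligned = (
        ((df["group"] == "treatment") & (df["landing_page"] == "new_page"))
        | ((df["group"] == "control") & (df["landing_page"] == "old_page"))
    )
    # .copy() so later column assignments do not touch the original frame
    return df[aligned].copy()

# Made-up rows for illustration only
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "group": ["control", "treatment", "treatment", "control"],
    "landing_page": ["old_page", "new_page", "old_page", "new_page"],
    "converted": [0, 1, 0, 1],
})
df2 = keep_aligned(df)
```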

a. How many unique user_ids are in df2?

b. There is one user_id repeated in df2. What is it?

c. Display the rows for the duplicate user_id.

d. Remove one of the rows with a duplicate user_id from the df2 dataframe.
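Parts (b) through (d) can be handled with pandas' `duplicated` and `drop_duplicates`. The frame below is made up for illustration; keeping the first of the two duplicate rows is an arbitrary choice, not something the notebook specifies.

```python
import pandas as pd

# Made-up stand-in for df2 with one repeated user_id
df2 = pd.DataFrame({
    "user_id": [10, 11, 11, 12],
    "landing_page": ["old_page", "new_page", "new_page", "old_page"],
    "converted": [0, 0, 0, 1],
})

# (b) which user_id is repeated
dup_ids = df2.loc[df2["user_id"].duplicated(), "user_id"].unique()

# (c) every row belonging to a repeated user_id
dup_rows = df2[df2["user_id"].duplicated(keep=False)]

# (d) keep the first occurrence and drop the rest
df2 = df2.drop_duplicates(subset="user_id", keep="first")
```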

◆▰▰▰▰▰▰▰▰►►►►►►►►►►►►►▰▰▰►►►►►►►►►►►►▰▰▰▰►►►►►►►►►▰▰▰▰▰▰▰►►►►►►►◆

Part I - Probability

a. What is the probability of an individual converting regardless of the page they receive?

b. Given that an individual was in the control group, what is the probability they converted?

c. Given that an individual was in the treatment group, what is the probability they converted?

d. What is the probability that an individual received the new page?

What is the probability that an individual received the old page?
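Because `converted` holds 1s and 0s, each of these probabilities is just a mean. The sketch below uses a made-up eight-row frame, so the numbers are not the notebook's results; `obs_diff` follows the name used later in the notebook.

```python
import pandas as pd

# Made-up data for illustration only
df2 = pd.DataFrame({
    "group": ["control"] * 4 + ["treatment"] * 4,
    "landing_page": ["old_page"] * 4 + ["new_page"] * 4,
    "converted": [1, 1, 0, 0, 1, 0, 0, 0],
})

# (a) overall conversion probability
p_overall = df2["converted"].mean()

# (b), (c) conversion probability within each group
p_by_group = df2.groupby("group")["converted"].mean()

# (d) probability of receiving the new page, and the old page
p_new_page = (df2["landing_page"] == "new_page").mean()
p_old_page = 1 - p_new_page

# observed difference, treatment minus control
obs_diff = p_by_group["treatment"] - p_by_group["control"]
```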

e. Consider your results from parts (a) through (d) above, and explain below whether the new treatment page leads to more conversions.

Answer: Comparing the two groups, the control group converts at a higher rate than the treatment group: subtracting the control group's conversion rate from the treatment group's gives a negative difference. Based on these probabilities alone, there is no evidence that the new page leads to more conversions.

◆▰▰▰▰▰▰▰▰►►►►►►►►►►►►►▰▰▰►►►►►►►►►►►►▰▰▰▰►►►►►►►►►▰▰▰▰▰▰▰►►►►►►►◆

Part II - A/B Test

Suppose I need to make the decision based only on the data provided. If I assume that the old page is better unless the new page proves to be definitely better at a Type I error rate of 5%, what should my null and alternative hypotheses be? I state the hypotheses below in terms of $p_{old}$ and $p_{new}$, the conversion rates for the old and new pages.

$$H_0: p_{new} \leq p_{old}$$

$$H_1: p_{new} > p_{old} $$

$$ \alpha = 0.05 $$

Under the null hypothesis, assume that $p_{new}$ and $p_{old}$ both have "true" success rates equal to the overall converted rate regardless of page; that is, $p_{new}$ and $p_{old}$ are equal. Furthermore, assume they are equal to the converted rate in ab_data.csv regardless of the page.

Use a sample size for each page equal to the ones in ab_data.csv.

Build the sampling distribution for the difference in converted rate between the two pages over 10,000 iterations of calculating an estimate from the null.

a. What is the conversion rate for $p_{new}$ under the null hypothesis?

b. What is the conversion rate for $p_{old}$ under the null hypothesis?

c. What is $n_{new}$ , the number of individuals in the treatment group?

d. What is $n_{old}$ , the number of individuals in the control group?

e. Simulate Sample for the treatment Group
Simulate $n_{new}$ transactions with a conversion rate of $p_{new}$ under the null hypothesis. Store these $n_{new}$ 1's and 0's in the new_page_converted numpy array.

f. Simulate Sample for the control Group
Simulate $n_{old}$ transactions with a conversion rate of $p_{old}$ under the null hypothesis. Store these $n_{old}$ 1's and 0's in the old_page_converted numpy array.

g. Find the difference in the "converted" probability $(p'_{new} - p'_{old})$ for your simulated samples from parts (e) and (f) above.
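Parts (e) through (g) amount to drawing two arrays of 1s and 0s and subtracting their means. In this sketch, `p_null`, `n_new` and `n_old` are placeholder values chosen for illustration, not the figures from ab_data.csv, and `sim_diff` is a made-up name.

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder values (assumptions); the notebook derives these from df2
p_null, n_new, n_old = 0.12, 1000, 1000

# (e), (f) one simulated 1/0 outcome per visitor under the null
new_page_converted = rng.binomial(1, p_null, size=n_new)
old_page_converted = rng.binomial(1, p_null, size=n_old)

# (g) difference in simulated conversion rates
sim_diff = new_page_converted.mean() - old_page_converted.mean()
```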

h. Sampling distribution
Re-create new_page_converted and old_page_converted and find the $(p'_{new} - p'_{old})$ value 10,000 times using the same simulation process you used in parts (a) through (g) above.

Store all $(p'_{new} - p'_{old})$ values in a NumPy array called p_diffs.
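A compact way to build p_diffs is sketched below. Instead of generating every 1/0 outcome, it draws the number of conversions per sample directly from a binomial distribution, which gives the same distribution of rates. The `simulate_p_diffs` helper and the parameter values passed to it are assumptions for illustration.

```python
import numpy as np

def simulate_p_diffs(p_null, n_new, n_old, n_sims=10_000, seed=0):
    """Simulate n_sims differences in conversion rate under the null."""
    rng = np.random.default_rng(seed)
    # Each draw is the count of conversions among n simulated visitors
    new_rates = rng.binomial(n_new, p_null, size=n_sims) / n_new
    old_rates = rng.binomial(n_old, p_null, size=n_sims) / n_old
    return new_rates - old_rates

# Placeholder inputs (assumptions), not the values from ab_data.csv
p_diffs = simulate_p_diffs(p_null=0.12, n_new=5000, n_old=5000)
```

Because both groups share the same rate under the null, the resulting differences should centre on zero.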

i. Histogram
Plot a histogram of the p_diffs. Does this plot look like what you expected? Use the matching problem in the classroom to make sure you fully understand what was computed here.

j. What proportion of the p_diffs are greater than the actual difference observed in the df2 data?
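The proportion asked for in part (j) is the mean of a boolean comparison. A minimal sketch, with a hypothetical helper name and a five-value toy array in place of the real p_diffs:

```python
import numpy as np

def simulated_p_value(p_diffs, obs_diff):
    """Share of simulated null differences strictly greater than obs_diff."""
    return float((np.asarray(p_diffs) > obs_diff).mean())

# Toy values for illustration: three of the five exceed -0.005
p_val = simulated_p_value([-0.02, -0.01, 0.0, 0.01, 0.02], obs_diff=-0.005)
```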

Plotting p_diffs reveals that the sampling distribution is approximately normal.

k. Please explain in words what you have just computed in part j above.

Answer: 1. Assuming the null hypothesis is true, I computed the probability of observing obs_diff or a more extreme result in favour of the alternative hypothesis. This probability, taken from the simulated sampling distribution of differences, is the p-value.

2. Since this p-value is greater than our designated Type I error level of $\alpha$ = 0.05, the result is not statistically significant. We therefore do not have sufficient evidence to reject the null hypothesis in favour of the alternative **(Fail to Reject Null)**. In other words, the data are consistent with the difference in conversion rates being less than or equal to 0 ($p_{new} - p_{old} \leq 0$), that is, with the old page converting at least as well as the new page ($p_{new} \leq p_{old}$).

l. Using Built-in Methods for Hypothesis Testing
We could also use a built-in function to achieve similar results. Though using the built-in might be easier to code, the above portions are a walkthrough of the ideas that are critical to correctly thinking about statistical significance.

m. Now use sm.stats.proportions_ztest() to compute your test statistic and p-value. Here is a helpful link on using the built-in.
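To show what the built-in computes, here is a hand-rolled pooled two-proportion z-test using only the standard library. It is a stand-in written for illustration, not the statsmodels implementation; calling `sm.stats.proportions_ztest` with `alternative='larger'` and the new page listed first should correspond to the same one-tailed test. The function name and the counts are made up.

```python
from math import erfc, sqrt

def two_proportion_ztest(count_new, n_new, count_old, n_old):
    """Pooled z-test of H0: p_new <= p_old against H1: p_new > p_old."""
    p_new = count_new / n_new
    p_old = count_old / n_old
    # Under H0 both groups share one rate, so pool the conversions
    p_pool = (count_new + count_old) / (n_new + n_old)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_new + 1 / n_old))
    z = (p_new - p_old) / se
    # Upper-tail area P(Z > z) of the standard normal
    p_value = 0.5 * erfc(z / sqrt(2))
    return z, p_value

# Made-up counts for illustration only
z, p = two_proportion_ztest(count_new=110, n_new=1000, count_old=120, n_old=1000)
print(f"z = {z:.3f}, one-tailed p = {p:.3f}")
```

A negative z with a p-value above 0.5 means the new page performed worse in the sample, so the upper-tail test cannot reject the null.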

n. What do the z-score and p-value you computed in the previous question mean for the conversion rates of the old and new pages? Do they agree with the findings in parts j. and k.?

Answer: The z-score and p-value obtained above imply that there is no statistical evidence to support the change to the webpage.

The p-value is greater than 0.05, which means we fail to reject the null: either the old page has a greater conversion rate than the new one, or both pages have similar conversion rates. This matches the result I got from the sampling distribution.

The z-score leads to the same conclusion: it is less than the critical value $z_{\alpha} \approx 1.645$ for a one-tailed test at the 95% confidence level, so we fail to reject the null.

◆▰▰▰▰▰▰▰▰►►►►►►►►►►►►►▰▰▰►►►►►►►►►►►►▰▰▰▰►►►►►►►►►▰▰▰▰▰▰▰►►►►►►►◆

Part III - A regression approach

In this final part, we will see that the result I achieved in the previous A/B test can also be achieved by performing regression.

a. Since each row in the df2 data is either a conversion or no conversion, what type of regression should you be performing in this case?

Answer: Since the outcome is binary (conversion or no conversion), logistic regression is the appropriate type of regression in this situation.

b. The goal is to use statsmodels library to fit the regression model you specified in part a. above to see if there is a significant difference in conversion based on the page-type a customer receives. However, you first need to create the following two columns in the df2 dataframe:

  1. intercept - It should be 1 in the entire column.
  2. ab_page - It's a dummy variable column, having a value 1 when an individual receives the treatment, otherwise 0.
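Both columns can be created in one line each. The three-row frame is made up; only the `group` values and the two new column names come from the notebook.

```python
import pandas as pd

# Made-up stand-in for df2
df2 = pd.DataFrame({
    "group": ["control", "treatment", "treatment"],
    "converted": [0, 1, 0],
})

# Constant column so the model estimates an intercept
df2["intercept"] = 1

# 1 for treatment rows, 0 for control rows
df2["ab_page"] = (df2["group"] == "treatment").astype(int)
```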

c. Use statsmodels to instantiate your regression model on the two columns you created in part (b). above, then fit the model to predict whether or not an individual converts.
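To make the fitting step concrete, the sketch below fits a logistic regression with a few Newton-Raphson iterations in NumPy. It is a stand-in written for illustration, not the statsmodels `Logit` implementation, and the `fit_logit` name and the data are made up. The reported p-value is two-sided.

```python
from math import erfc, sqrt

import numpy as np

def fit_logit(X, y, n_iter=25, tol=1e-10):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    Returns (coefficients, standard errors).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                        # score vector
        hess = X.T @ (X * (p * (1 - p))[:, None])   # observed information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Standard errors from the inverse information at the final estimate
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    info = X.T @ (X * (p * (1 - p))[:, None])
    return beta, np.sqrt(np.diag(np.linalg.inv(info)))

# Made-up data: 100 control rows (20 convert), 100 treatment rows (30 convert)
ab_page = np.r_[np.zeros(100), np.ones(100)]
X = np.column_stack([np.ones(200), ab_page])   # columns: intercept, ab_page
y = np.r_[np.ones(20), np.zeros(80), np.ones(30), np.zeros(70)]

coefs, ses = fit_logit(X, y)
z_ab = coefs[1] / ses[1]
p_two_sided = erfc(abs(z_ab) / sqrt(2))   # two-sided p-value for ab_page
```

With a single 0/1 predictor, the intercept equals the log-odds of conversion in the control group and the ab_page coefficient equals the difference in log-odds between the groups.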

d. Provide the summary of your model below, and use it as necessary to answer the following questions.

e. What is the p-value associated with ab_page?
Why does it differ from the value you found in Part II?

Answer
1. The p-value associated with ab_page is 0.190.
2. It differs because the A/B test in Part II was one-tailed ($H_0: p_{new} \leq p_{old}$, $H_1: p_{new} > p_{old}$), whereas the logistic regression uses a two-tailed test ($H_0: p_{new} = p_{old}$, $H_1: p_{new} \neq p_{old}$).
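The two p-values can be related directly. The sketch below assumes a symmetric test statistic and an estimated effect that is negative (the control group converts better, as found in Part I); under those assumptions the two-tailed 0.190 corresponds to an upper-tail p-value of about 0.905. The helper name is made up.

```python
def one_sided_from_two_sided(p_two_sided, effect_sign):
    """Upper-tail p-value for H1: p_new > p_old from a two-sided p-value.

    effect_sign is +1 if the estimated effect is positive, -1 if negative.
    Assumes a symmetric test statistic.
    """
    half = p_two_sided / 2
    # A positive effect sits in the upper tail; a negative one leaves
    # almost all of the distribution above it.
    return half if effect_sign > 0 else 1 - half

p_upper = one_sided_from_two_sided(0.190, effect_sign=-1)  # 1 - 0.190/2 = 0.905
```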

f. Now, you are considering other things that might influence whether or not an individual converts. Discuss why it is a good idea to consider other factors to add into your regression model. Are there any disadvantages to adding additional terms into your regression model?

Answer
◈ Since countries.csv is available, we can make good use of it by adding country to our regression model and checking whether it influences whether or not a client converts.
◈ Other factors can explain variation in conversion that the page alone does not. However, if an added variable is strongly correlated with a predictor already in the model, it can hurt the model by making the coefficient estimates unstable and harder to interpret.
◈ This is the main disadvantage of adding many terms to a regression model: multicollinearity, which occurs when two or more predictors are linearly related to each other.

g. Adding countries
Now along with testing if the conversion rate changes for different pages, also add an effect based on which country a user lives in.

  1. You will need to read in the countries.csv dataset and merge it with your df2 dataset on the appropriate rows. Call the resulting dataframe df_merged. Here are the docs for joining tables.

  2. Does it appear that country had an impact on conversion? To answer this question, consider the three unique values, ['UK', 'US', 'CA'], in the country column. Create dummy variables for these country columns.

    Provide the statistical output as well as a written response to answer this question.
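The merge and the dummy columns can be sketched as follows. The frames are made up, and joining on `user_id` is an assumption about how countries.csv is keyed. Since the answer below reports p-values for US and UK, CA is treated as the baseline and left out of the model columns.

```python
import pandas as pd

# Made-up stand-ins for df2 and countries.csv
df2 = pd.DataFrame({
    "user_id": [1, 2, 3],
    "intercept": [1, 1, 1],
    "ab_page": [0, 1, 1],
    "converted": [0, 1, 0],
})
countries = pd.DataFrame({
    "user_id": [1, 2, 3],
    "country": ["UK", "US", "CA"],
})

# Inner join keeps only users present in both tables
df_merged = df2.merge(countries, on="user_id", how="inner")

# One 0/1 column per country; astype(int) avoids boolean dummies
dummies = pd.get_dummies(df_merged["country"]).astype(int)
df_merged = df_merged.join(dummies)

# CA is the baseline, so only US and UK enter the model
model_columns = ["intercept", "ab_page", "US", "UK"]
```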

Answer: From the logistic model summary, the p-values for both US and UK are greater than our designated Type I error level of $\alpha$ = 0.05, so neither is statistically significant. Neither the page nor the country appears to affect how many users convert.

h. Fit your model and obtain the results
Though you have now looked at the individual factors of country and page on conversion, we would now like to look at an interaction between page and country to see if there are significant effects on conversion. Create the necessary additional columns, and fit the new model.

Provide the summary results (statistical output), and your conclusions (written response) based on the results.
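An interaction term is the product of the page dummy and a country dummy. A minimal sketch with a made-up frame; the `US_ab_page` and `UK_ab_page` column names are hypothetical.

```python
import pandas as pd

# Made-up stand-in for df_merged with the dummy columns already present
df_merged = pd.DataFrame({
    "ab_page": [0, 1, 1, 0],
    "US": [1, 1, 0, 0],
    "UK": [0, 0, 1, 0],
})

# 1 only when the user is in that country AND saw the new page
df_merged["US_ab_page"] = df_merged["US"] * df_merged["ab_page"]
df_merged["UK_ab_page"] = df_merged["UK"] * df_merged["ab_page"]
```

These two columns would then be added to the model alongside ab_page, US and UK.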

Answer: All the p-values for the added parameters are greater than our designated Type I error level of $\alpha$ = 0.05, so none of them is statistically significant. The country, the page, and the interaction of these two factors therefore do not seem to affect how many users convert.

◆▰▰▰▰▰▰▰▰►►►►►►►►►►►►►▰▰▰►►►►►►►►►►►►▰▰▰▰►►►►►►►►►▰▰▰▰▰▰▰►►►►►►►◆

Visualization

▣ As we can see, there is not much difference between the new page conversions and the old page conversions.

▣ The treatment group has more individuals than the control group, but this still did not make any difference in our test.

◆▰▰▰▰▰▰▰▰►►►►►►►►►►►►►▰▰▰►►►►►►►►►►►►▰▰▰▰►►►►►►►►►▰▰▰▰▰▰▰►►►►►►►◆

Conclusions

◍ There is no evidence that the new page would increase the e-commerce company's conversion rate, according to the results of the A/B test and the logistic regression (the p-values are greater than our designated Type I error level of $\alpha$ = 0.05). This may be because the test ran for only 22 days; a longer test might have affected our results.

◍ I would advise the business to keep using the old page while continuing to improve the website in ways that could favourably affect the conversion rate. Another A/B test could then be performed to determine whether these improvements increased the conversion rate.

◆▰▰▰▰▰▰▰▰►►►►►►►►►►►►►▰▰▰►►►►►►►►►►►►▰▰▰▰►►►►►►►►►▰▰▰▰▰▰▰►►►►►►►◆

Resources

JOIN
Linear Regression Vs Logistic Regression